-
Notifications
You must be signed in to change notification settings - Fork 1.9k
ATM: Implement the current endpoint filters as EndpointCharacteristics #11281
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
...experimental/adaptivethreatmodeling/modelbuilding/extraction/ExtractEndpointDataTraining.qll
Fixed
Show fixed
Hide fixed
Also disambiguate three filters from three different sink types that all have the same name, "not a direct argument to a likely external library call or a heuristic sink".
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointCharacteristics.qll
Fixed
Show fixed
Hide fixed
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointCharacteristics.qll
Fixed
Show fixed
Hide fixed
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointCharacteristics.qll
Fixed
Show fixed
Hide fixed
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointCharacteristics.qll
Fixed
Show fixed
Hide fixed
…` to the base class. They can now be implemented generically for all sink types.
...ql/experimental/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/ATMConfig.qll
Fixed
Show fixed
Hide fixed
...ql/experimental/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/ATMConfig.qll
Fixed
Show fixed
Hide fixed
This is needed because we changed the names of three endpoint filters that were all called "not a direct argument to a likely external library call or a heuristic sink" in order to disambiguate them (fc56c5a).
…tFeatures` overrides it.
adityasharad
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. I can see the existing filtering logic translated into the new object-oriented hierarchy, and that you've taken care to preserve that logic.
My suggestions are mostly about the clarity and performance of the QL code, rather than the semantics of the filters. Most are not blocking but I hope will be useful.
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointCharacteristics.qll
Show resolved
Hide resolved
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointCharacteristics.qll
Outdated
Show resolved
Hide resolved
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointCharacteristics.qll
Show resolved
Hide resolved
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/StandardEndpointFilters.qll
Outdated
Show resolved
Hide resolved
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointCharacteristics.qll
Outdated
Show resolved
Hide resolved
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointCharacteristics.qll
Show resolved
Hide resolved
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointCharacteristics.qll
Show resolved
Hide resolved
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointCharacteristics.qll
Show resolved
Hide resolved
...ql/experimental/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/ATMConfig.qll
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love love love how this is remolding logic that used to be really hard to manage into something more maintainable. Things are really starting to fall into place! Thank you :)
The one complaint I have about this is how the StandardEndpointFilterCharacteristic class is being made magic by its usage in other modules. The bright side is that there's very easy fixes that I have outlined in my comments. Addressing these should be doable in 20-30min.
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointCharacteristics.qll
Outdated
Show resolved
Hide resolved
...experimental/adaptivethreatmodeling/modelbuilding/extraction/ExtractEndpointDataTraining.qll
Show resolved
Hide resolved
...experimental/adaptivethreatmodeling/modelbuilding/extraction/ExtractEndpointDataTraining.qll
Show resolved
Hide resolved
...experimental/adaptivethreatmodeling/modelbuilding/extraction/ExtractEndpointDataTraining.qll
Show resolved
Hide resolved
...pt/ql/experimental/adaptivethreatmodeling/test/endpoint_large_scale/FilteredTruePositives.ql
Show resolved
Hide resolved
...experimental/adaptivethreatmodeling/test/endpoint_large_scale/FilteredTruePositives.expected
Show resolved
Hide resolved
...pt/ql/experimental/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/XssATM.qll
Show resolved
Hide resolved
...ql/experimental/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/ATMConfig.qll
Show resolved
Hide resolved
...l/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointCharacteristics.qll
Show resolved
Hide resolved
...ql/experimental/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/ATMConfig.qll
Outdated
Show resolved
Hide resolved
Make `SyntacticHeuristics` an explicit import
|
I'm still working on this review, trying to figure out what the real problem is here and coming up with a specific solution. The PR's interdependencies are quite complex, which is why this has taken a little. |
|
I think I'm through with the review. 🤡 The PR sadly reproduces some of the confusing definitions we currently have in I'm a bit sick and will call it a day now. ✅ I consider the review of this PR done and approved, assuming you are making the |
|
(Minor request: Do you mind using words rather than emoji? I don't remember off the top of my head what each emoji was meant to signify, and I had to spend some time searching for the slack thread to look up what 🤡 means) |
| // Don't surface endpoint filters as characteristics, because they were previously not surfaced. | ||
| // TODO: Experiment with surfacing these to the modeling code by removing the following line (and then make | ||
| // EndpointFilterCharacteristic private). | ||
| not characteristic instanceof EndpointFilterCharacteristic and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see why your commit didn't fail endpoint_large_scale/ExtractEndpointDataTraining.qlref: It's because it didn't remove this line. If we were to raise the confidence of the endpoint filters without having this line, the training set would include all non-effective-sinks, which would be bad, because many of those are in fact low-confidence characteristics.
Do you see a point in making StandardEndpointFilterCharacteristic private without removing this line or making EndpointFilterCharacteristic private?
IMO, I think we need to be consistent: Either we use subclasses of EndpointCharacteristics to replicate the existing data for now, or we try to improve the logic at the same time that we improve the code, and each PR requires end-to-end testing. I definitely prefer the former. The nice thing is that, once we're ready to progress to the second phase, even without my TODO comments and issue tracking, we can easily look at all EndpointCharacteristics that aren't private and find the places they're used in the code, and that will tell us where we want to make improvements. Ultimately all EndpointCharacteristics should be private, with subclasses used only for two things: (1) Abstract classes are used to apply the same set of implications to many different subclasses without needing to overwrite getImplications() over and over. (2) Non-abstract classes are used to define the labels for type balancing through getEndpoints().
Unfortunately I don't think I can do that 😞. Or rather, I can do it by introducing a new confidence level, but that would further hide ugliness in a way that will make it harder to fix later. I know this back-and-forth is frustrating, and I'm sure a quick sync would be an easier way to reach resolution, but we can do that when you're feeling better -- please don't push yourself!
Thank you! ❤️
No rush! There's a chance that one is trickier than I realized. Luckily, if we do need big changes on that one, there's only one PR following it in the chain, so it won't require big changes to a long chain of subsequent PRs. Anyway, Thursday and Friday are bank holidays in the US, so unless that PR is unexpectedly ready for approval and merging tomorrow, it won't get merged before Monday. More importantly, get lots of rest and get healthy! |
Thank you! I agree, there's no point rushing this. I see several ways forward, we only need to find one that works for both of us. And enjoy your time off! |
Select endpoints to score at inference time base purely on their confidence level, and not on whether they fit the historical definition of endpoint filters.
Endpoint filters added commits
|
@kaeluka I merged this change we agreed on into this PR, and updated the predicate's documentation accordingly. I think this addresses all your concerns, unless I lost track of something else in the long discussion 😅 Sending this back to you for final(?) review 🏓 |
...ql/experimental/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/ATMConfig.qll
Fixed
Show fixed
Hide fixed
I've been working with Brits for too long :)
kaeluka
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥳
👍
📢 This PR is a bit bigger, so commit-by-commit review might be easier. I've tried to make the commit comments as informative as possible.
Main changes
EndpointCharacteristics.isEffectiveSinkandgetAReasonSinkExcludedto the base class, as they can now be implemented generically for all sink types.getAReasonSinkExcludedis how we'd adjust which endpoints we score at inference time. For now I've implemented it to replicate the logic in the old code, so that results remain unaffected. I've tracked possible experiments to improve this selection in https://github.com/github/ml-ql-adaptive-threat-modeling/issues/2126.A few notes
Note that this PR still sticks to the principle of not breaking any tests, except that I had to disambiguate three filters from three different sink types that all had the same name (fc56c5a), and that required a tiny update to
FilteredTruePositives.expected(0fd013f).Also note that the training data is unaffected because (for now) I've given all EndpointFilterCharacteristics medium confidence, whereas only high-confidence characteristics contribute to training set selection. AIUI, the reason endpoint filters weren't used to select negative training samples in the old code was precisely this: their accuracy is high enough that we don't want to waste inference time scoring these endpoints, but not high enough that we can reliably use them as negative training samples. It's worth having someone with the needed expertise (Stephan? 😉) go through them eventually to consider whether any should be promoted to high confidence. I tracked this possible experiment in https://github.com/github/ml-ql-adaptive-threat-modeling/issues/2126.
Timing checks
endpoint_large_scale/ExtractEndpointDataTrainingremains like it was after the last PR: About 5s.Closes https://github.com/github/ml-ql-adaptive-threat-modeling/issues/2100
Probably also closes https://github.com/github/ml-ql-adaptive-threat-modeling/issues/2101?